Skip to content

feat: add crawling provider#52

Open
nirolevy wants to merge 3 commits intoMapColonies:masterfrom
nirolevy:crawling-provider
Open

feat: add crawling provider#52
nirolevy wants to merge 3 commits intoMapColonies:masterfrom
nirolevy:crawling-provider

Conversation

@nirolevy
Copy link
Copy Markdown

Question Answer
Bug fix
New feature
Breaking change
Deprecations
Documentation
Tests added
Chore

Further information:
Add provider to traverse JSON tilesets to map the FS

@nirolevy nirolevy force-pushed the crawling-provider branch 4 times, most recently from 17d6f58 to 6f5f44b Compare September 16, 2025 10:47
@nirolevy nirolevy force-pushed the crawling-provider branch 2 times, most recently from b7ce5be to 7c4a2b4 Compare September 16, 2025 11:51
Copy link
Copy Markdown
Contributor

@asafMasa asafMasa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥇 Good job, but I have some remarks, thanks.

@withSpanAsyncV4
public async getFile(filePath: string): Promise<Buffer> {
const logContext = { ...this.logContext, function: this.getFile.name };
const pvPath = this.config.pvPath;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be set once in the Ctor instead of being declared every time this method is called (or you can use 'this.config.pvPath' instead)

Comment thread tests/helpers/s3Helper.ts
}

public async createFileOfModel(model: string, file: string): Promise<void> {
public async createFileOfModel(model: string, file: string): Promise<Buffer> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of creating fake content here and returning it, consider passing an optional argument of data to be written, and keep the method return type as Promise

});

describe('getFile', () => {
it('When calling getFile, should get the file content from pv path', async () => {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why didn't you add the test of "it(When the file does not exist in the storage, throws error", same as in the S3 provider? Is there a difference in the expected response from the "getFile" in the provider?

@@ -0,0 +1,41 @@
import config from 'config';
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥇 Thanks for adding the provider tests

Comment thread config/default.json
},
"crawling": {
"extension": ".json",
"nestedJsonPath": "$.root..uri",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add support for both "uri" and "url" (old legacy 3Dtiles models)

@@ -0,0 +1,137 @@
import fs from 'fs';
import os from 'os';
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use from 'node:os';

modelId,
modelName,
});
throw new AppError(StatusCodes.NOT_ACCEPTABLE, 'File is not a valid JSON', false);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you added this status to the valid openAPI status responses?


describe('Bad Path', function () {
// All requests with status code of 400
it('should return 400 status code if a job with same name exists in job manager', async function () {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add test with an invalid file type returns 406 NOT_ACCEPTABLE response

});
throw new AppError(StatusCodes.NOT_ACCEPTABLE, 'File is not a valid JSON', false);
} else {
throw err;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this exception logged?


@injectable()
export class CrawlingProvider implements Provider {
private readonly logContext: LogContext;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that you added another "Provider" for the crawling, but I think the approach should have been to create an abstractBaseProvider for NFS and S3 to inherit from, as we don't want to keep the old behavior.
We want to change it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants